Document Clustering with Relational Graph Of Common Phrase and Suffix Tree Document Model
Authors
Abstract
Similar resources
Phrase based Clustering Scheme of Suffix Tree Document Clustering Model
Document clustering is a difficult and relatively recent area of search engine research. Most existing document clustering techniques use a group of keywords from each document to cluster the documents. Document clustering arises from information retrieval domains, and “It finds grouping for a set of documents belonging to the same cluster are similar and documents belongs ...
The Suffix Tree Document Model Revisited
In text-based information retrieval, which is the predominant retrieval task at present, several document models have been proposed, such as boolean, probabilistic, or (extended) vector models [Baeza-Yates and Ribeiro-Neto 1999]. Interestingly, the suffix tree document model is usually not discussed in the literature on the subject though it comes along with a property that sets it apart from t...
Phrase Clustering Without Document Context
We applied different clustering algorithms to the task of clustering multi-word terms in order to reflect a humanly built ontology. Clustering was done without the usual document co-occurrence information. Our clustering algorithm, CPCL (Classification by Preferential Clustered Link), is based on general lexico-syntactic relations which do not require prior domain knowledge or the existence of a...
Document Clustering with K-tree
This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queensland University of Technology. We introduce the K-tree clustering algorithm in an Information Retrieval context by adapting it for document clustering. Many large scale problems exist in document clustering. K-tree scales well with large inputs due to its low complexity. It offers promising resu...
Journal
Journal title: The Journal of the Korea Contents Association
Year: 2009
ISSN: 1598-4877
DOI: 10.5392/jkca.2009.9.2.142